Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation


Task demonstrations are an intuitive way of communicating complex tasks to a robot. A recent study on arXiv.org presents a robotic system that learns pick-and-place tasks for unseen objects in a data-efficient manner.



Robotic grippers. Image credit: Ars Electronica via Flickr, CC BY-NC-ND 2.0

Researchers propose a novel method, called Neural Descriptor Fields, to encode dense correspondence across object instances. Coordinate frames, represented as SE(3) poses, are associated with local geometric structure via a rigid set of query points. The resulting dense descriptors generalize across both object instances and SE(3) configurations, so the approach applies to novel objects in novel rotations and translations, where 2D dense descriptors are insufficient.
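To make the idea of a pose descriptor concrete, here is a minimal sketch (not the authors' code) of how a rigid set of query points attached to an SE(3) pose can be turned into a single descriptor vector. The function and variable names are illustrative assumptions, and the per-point feature is a toy geometric stand-in for the trained, point-cloud-conditioned network used in the paper.

```python
import numpy as np

def transform_points(pose: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Apply a 4x4 SE(3) transform to an (N, 3) array of query points."""
    homogeneous = np.hstack([points, np.ones((points.shape[0], 1))])
    return (pose @ homogeneous.T).T[:, :3]

def point_descriptor(object_cloud: np.ndarray, query_point: np.ndarray, k: int = 8) -> np.ndarray:
    """Toy stand-in for the learned per-point descriptor: sorted distances from the
    query point to its k nearest neighbors in the observed object point cloud.
    (The actual method uses a neural feature extractor trained via 3D auto-encoding.)"""
    dists = np.linalg.norm(object_cloud - query_point, axis=1)
    return np.sort(dists)[:k]

def pose_descriptor(object_cloud: np.ndarray,
                    pose: np.ndarray,
                    query_points: np.ndarray) -> np.ndarray:
    """Descriptor for a full SE(3) pose: rigidly attach the query points to the pose,
    evaluate the per-point descriptor at each transformed point, and concatenate."""
    transformed = transform_points(pose, query_points)
    return np.concatenate([point_descriptor(object_cloud, p) for p in transformed])
```

Because the descriptor is built from a whole rigid constellation of points rather than a single location, it captures both the position and the orientation of the gripper relative to the object's local geometry.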

Neural Descriptor Fields enable both pick and place of unseen object instances in out-of-distribution configurations, with a success rate above 85% while using only ten expert demonstrations.

We present Neural Descriptor Fields (NDFs), an object representation that encodes both points and relative poses between an object and a target (such as a robot gripper or a rack used for hanging) via category-level descriptors. We employ this representation for object manipulation, where given a task demonstration, we want to repeat the same task on a new object instance from the same category. We propose to achieve this objective by searching (via optimization) for the pose whose descriptor matches that observed in the demonstration. NDFs are conveniently trained in a self-supervised fashion via a 3D auto-encoding task that does not rely on expert-labeled keypoints. Further, NDFs are SE(3)-equivariant, guaranteeing performance that generalizes across all possible 3D object translations and rotations. We demonstrate learning of manipulation tasks from few (5-10) demonstrations both in simulation and on a real robot. Our performance generalizes across both object instances and 6-DoF object poses, and significantly outperforms a recent baseline that relies on 2D descriptors.
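The abstract's "searching (via optimization) for the pose whose descriptor matches that observed in the demonstration" can be sketched as a small numerical search. The snippet below is a hedged illustration, not the authors' implementation: it reuses the hypothetical pose_descriptor from the sketch above, parameterizes the pose as a 6-vector (translation plus axis-angle rotation), and minimizes an L1 descriptor distance with a gradient-free optimizer; the real system differentiates through the trained network instead.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

def vector_to_pose(x: np.ndarray) -> np.ndarray:
    """Map a 6-vector (tx, ty, tz, rx, ry, rz) to a 4x4 SE(3) matrix."""
    pose = np.eye(4)
    pose[:3, :3] = Rotation.from_rotvec(x[3:]).as_matrix()
    pose[:3, 3] = x[:3]
    return pose

def find_matching_pose(new_object_cloud: np.ndarray,
                       demo_descriptor: np.ndarray,
                       query_points: np.ndarray,
                       x0: np.ndarray | None = None) -> np.ndarray:
    """Search for the pose on a new object instance whose descriptor best matches
    the descriptor recorded from the demonstration."""
    x0 = np.zeros(6) if x0 is None else x0

    def cost(x: np.ndarray) -> float:
        candidate = pose_descriptor(new_object_cloud, vector_to_pose(x), query_points)
        return float(np.abs(candidate - demo_descriptor).sum())  # L1 descriptor distance

    result = minimize(cost, x0, method="Nelder-Mead")
    return vector_to_pose(result.x)
```

Because the descriptors generalize across instances and SE(3) configurations, the pose recovered this way transfers a demonstrated grasp or placement to a new object in a new 6-DoF configuration without any instance-specific labels.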